Plant species, including algae and fungi, are based on type specimens to which the name of a taxon is
permanently attached. Applying a scientific name to any specimen therefore requires demonstrating
correspondence between the type and that specimen. Traditionally, identifications have been based on
morpho-anatomical characters, but systematists now increasingly use DNA sequence data. These studies are
flawed if the DNA is isolated from misidentified modern specimens. We propose a genome-based solution.
Using 4 × 4 mm² of material from type specimens, we assembled 14 plastid and 15 mitochondrial genomes
attributed to the red algae Pyropia perforata, Py. fucicola, and Py. kanakaensis. The chloroplast genomes
were fairly conserved, but the mitochondrial genomes differed significantly among populations in content
and length. Complete genomes are attainable from 19th and early 20th century type specimens; this validates
the effort and cost of their curation and supports the practice of the type method.

The correct application of 18th, 19th, and early 20th century plant names to modern specimens is a
challenging undertaking. Plant names, including those of algae and fungi 1 , are based on type specimens,
the original specimens on which species names are based. These specimens are housed in approximately 3,400
official herbaria and maintained by more than 10,000 herbarium curators at museums and universities around
the world 2 . Historically, to assign the correct names to modern collections, type specimens were borrowed
for anatomical and morphological comparison. This approach, however, is fraught with problems, particularly
for morphologically simple and/or variable species (e.g., most algae, fungi, and numerous land plants), or
where type material is missing, fragmented, or lacks the vegetative, reproductive, or geographic information
necessary for correspondence with modern collections. Compounding the problem, many herbarium curators are
reluctant, and sometimes hostile, to loan material for what is termed "destructive sampling", the extraction
of DNA from a fragment of a type specimen. One of the currently accepted answers to this problem is to
collect fresh specimens and perform phylogenetic analyses using standard species markers 3-5 . Another is to
use modern DNA to develop representative barcodes of species 5,6 . The fundamental idea of the barcode is to
create a database of comparable sequences that researchers use for species determination. A global Barcode
of Life Database (BOLD), which focuses on the barcode, and the various online repositories (EMBL, GenBank,
DDBJ) contain millions of submissions that serve this purpose. The major problem with these two approaches
is the assumption that a barcode from any specimen said to be a particular species is truly representative
of the type material of that species. The only indisputable method for linking a species name to type
material is by sequencing type specimens 7-10 . This approach too has limitations. Specifically, usually
only small (~200 base pairs) hypervariable regions of DNA can be obtained 11 , and therefore the complete
gene sequences required for phylogenetic analyses are not achievable. The age-old question remains: how do
scientists unite the alpha system of taxonomy with modern systematics?
To address this question we isolated DNA from small herbarium fragments (4 × 4 mm²) of species in the
economically important red algal genus Pyropia (Py.), recently segregated from Porphyra (Po.) 12 and both
marketed as nori, as follows: 6 type specimens attributed to Py. perforata (J. Agardh) S.C. Lindstrom,
6 non-type specimens of Py. perforata distributed in the northeast Pacific from Washington to Baja
California Sur, Mexico, 1 specimen from the type sheet of Py. perforata attributed to Py. kanakaensis
(Mumford) S.C. Lindstrom, and the holotype collections of 2 northeastern Pacific species, Py. fucicola
(V. Krishnamurthy) S.C. Lindstrom and Py. kanakaensis (Fig. 1) (Table 1). The specimens ranged in age from
140 years old (collected in 1874) to recent (collected fresh). Included in this analysis are the type
specimens of two species (Po. perforata f. segregata Setchell and Hus and Po. sanjuanensis
V. Krishnamurthy (Fig. 1)) considered distinct by some authors 13,14 , and conspecific with Py. perforata
by others 15,16 .

Results 

Quantitation and Data. High sensitivity quantitation of the DNA extractions indicated intact DNA fragments
35-500 base pairs in length (Fig. 1), with considerable variation in concentration between specimens (e.g.,
compare Fig. 1f, syntype material of Po. perforata f. segregata, with Fig. 1j, the lower specimen on the
lectotype sheet of Py. perforata). Because the DNA was fragmented, the specimens were subjected to single
end 36 bp Illumina next generation sequencing 17 . The number of filtered sequencing reads generated from
the 15 specimens varied from 4,716,038 to 68,784,178 (Table 2). The reads were sufficient to assemble the
complete chloroplast genomes from 12 of the 15 specimens and the complete mitochondrial genomes for all 15
of the specimens, with an average N50 across the 15 specimens of 25,274 bp and an average maximum contig
length of 54,472 bp (Table 2). Prior to analyzing all of the specimens, filtered reads from the first three
type materials (LD-Ag 13037, UC 807662, VK-11-00061) were screened for bacterial and human contamination
and found to contain less than 0.75% contamination 18 .
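
The N50 statistic reported above can be illustrated with a minimal Python sketch. It assumes the common definition of N50 (the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length). The function name and the contig lengths are hypothetical and are not taken from Table 2.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the summed length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
example_contigs = [40000, 30000, 20000, 10000]
print(n50(example_contigs))
```

Averaging this value over the per-specimen assemblies would give a summary figure of the kind reported in the text.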

Chloroplast Genome Analysis. The chloroplast genomes of Py. 
perforata were similar in length (189,752 bp to 189,889 bp), 
content, and gene synteny, all containing 209 protein-coding genes 
(including 24 ycf and 27 open reading frames (ORFs)), 35 tRNAs, and 3
ribosomal RNAs, totaling 247 genes (Supplementary Figures 1-5,
Supplementary Table 1). The partial chloroplast genomes of Py. 
fucicola and Py. kanakaensis we generated account for 97.5% of the 
estimated complete genome length. The assembly methods we 
employed for these two holotypes were unable to resolve a region 
approximately 4.8 kb in length representing non-identical ribosomal 
16S, 23S, and 5S repeats. The content and synteny of Py. fucicola and 
Py. kanakaensis are similar to Py. perforata and other Pyropia 
species. 

Within populations of Py. perforata the chloroplast sequences were highly conserved. Two syntype
collections of Po. perforata f. segregata from La Jolla, California were nearly identical (differing by
1 SNP), as were two specimens from the lectotype sheet of Py. perforata from San Francisco, California
(6 SNPs, 4 gaps), and two syntype specimens of Py. perforata from Santa Barbara, California (4 SNPs). The
type collections of Po. perforata f. segregata and Po. sanjuanensis differed from the lectotype of Py.
perforata by 185 SNPs (+14 gaps) and 75 SNPs (+1 gap), respectively. The non-type material of Py. perforata
from Punta San Roque, Baja California Sur showed the greatest intraspecific sequence divergence from Py.
perforata (1,072 SNPs and 75 gaps). Pairwise distances between specimens of Py. perforata ranged from
0.0000 to 0.0053 (Supplementary Table 2). Interspecific distances were lowest between Py. perforata and Py.
haitanensis (0.1178) and highest between Py. perforata and Py. fucicola (0.1453).
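
To make the SNP, gap, and distance figures above concrete, the following minimal Python sketch tallies differences between two sequences that have already been aligned. It is a simplification under stated assumptions: gaps are counted per alignment column rather than per gap event, and the distance is the uncorrected proportion of differing sites (p-distance), not the GTR-based distance used in this study. The function name and the example sequences are hypothetical.

```python
def compare_aligned(seq_a, seq_b):
    """Compare two aligned sequences of equal length.

    Returns (snps, gap_columns, p_distance), where p_distance is the number
    of mismatches divided by the number of columns with a base in both
    sequences. Columns that are gaps in both sequences are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    snps = 0
    gap_columns = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap: carries no information for this pair
        if a == "-" or b == "-":
            gap_columns += 1
            continue
        compared += 1
        if a != b:
            snps += 1
    p_distance = snps / compared if compared else 0.0
    return snps, gap_columns, p_distance


# Hypothetical aligned fragments, for illustration only.
snps, gaps, dist = compare_aligned("ACGTACGT-A", "ACGAACGTTA")
print(snps, gaps, round(dist, 4))
```

A model-based distance such as GTR corrects for multiple substitutions at the same site, so it would generally be somewhat larger than the p-distance for divergent sequences.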



Maximum likelihood analysis of the chloroplast genomes of 18 complete sequences indicates strong support
for a clade containing Py. perforata in a sister relationship to Py. haitanensis and Py. kanakaensis
(Fig. 2). The same relationships, but with less bootstrap support, were found when a likelihood analysis
was performed using only the rbcL gene from the same specimens (Fig. 2). Locally collinear blocks (LCBs)
analysis of 12 chloroplast sequences against the published genomes of Pyropia (Py. yezoensis and Py.
haitanensis) and Porphyra (Po. purpurea and Po. umbilicalis) 20,21 identified 33 conserved gene regions
using Cyanidium caldarium 22 as an outgroup. The data confirm that genome structure is highly conserved
within the Bangiaceae (Fig. 3). The only apparent difference is that all specimens of Py. perforata
contained three fewer non-identical ribosomal 16S, 23S and 5S repeats (approximately 4.8 kb) compared to
other Bangiaceae.

Mitochondrial Genome Analysis. The mitochondrial genomes of 
specimens attributed to Py. perforata harbored 55 to 59 genes, with 
lengths ranging from 32,491 bp (Py. perforata from Carmel, 
California) to 40,042 bp (holotype of Po. sanjuanensis from San 
Juan Island, Washington) (Table 2, Supplementary Table 3). 
Specimens of Py. perforata contained 2-3 ribosomal RNA genes 
[1-2 large subunit (rnl), 1 small subunit (rns)], 23-24 transfer 
RNAs, 4 ribosomal proteins, 2 ymfs, and 18-19 genes involved in 
electron transport and oxidative phosphorylation. The number of 
ORFs varied between specimens (3 ORFs in Py. perforata from 
Carmel, California to 7 ORFs in the holotype of Po. sanjuanensis) 
(Supplementary Figures 6-17, Supplementary Table 3). The genome 
content of Py. fucicola was similar to that of Py. perforata; Py.
kanakaensis, however, lacked orf546 but contained orf729.

The mitochondrial genome sequences within populations of Py. perforata were similar. Two syntype
collections of Po. perforata f. segregata from La Jolla, California were nearly identical (differing by
5 SNPs, 2 gaps), as were the two specimens from the lectotype sheet of Po. perforata from San Francisco,
California (4 SNPs, 2 gaps), and two syntype specimens of Py. perforata from Santa Barbara, California
(3 SNPs). In contrast, the genomes of Py. perforata from different populations varied in their content and
length. The type collections of Po. perforata f. segregata and Po. sanjuanensis differed from the lectotype
of Py. perforata by 120 SNPs (+8 single nucleotide gaps and 3 large gaps) and 106 SNPs (+3 single
nucleotide gaps and 3 large gaps), respectively. The specimen from Punta San Roque, Baja California Sur
exhibited the greatest intraspecific variation compared to the lectotype of Py. perforata, showing 934
SNPs, 127 single/multiple length gaps, and 1 large gap. Pairwise distances between specimens of Py.
perforata ranged from 0.0000 to 0.0641 (Supplementary Table 4). The distance between the holotype of Py.
kanakaensis and a more recent collection of this species from Land's End, San Francisco was 0.0039.
Interspecific distances were lowest between Py. perforata and Py. fucicola (0.1963) and highest between Py.
perforata and Py. yezoensis (0.3226). Distances between Py. yezoensis, Py. haitanensis, Py. kanakaensis,
and Py. fucicola ranged from 0.1113 to 0.3499.

Maximum likelihood analysis of the complete mitochondrial genomes found strong support for a single
monophyletic clade containing Py. perforata, which was sister in position to Py. haitanensis and Py.
kanakaensis (Fig. 2). Phylogenetic analysis of the same representatives using only their cytochrome
oxidase 1 sequences (664 bp) failed to resolve the populations of Py. perforata, and found different
relationships for the other species of Pyropia (Fig. 2). LCB analysis and linearized barcode alignments of
the 15 Pyropia genomes generated here, against those of published Pyropia and Porphyra 21,23-26 , identified
18 conserved gene regions (Fig. 4). The alignments depict numerous insertion/deletion events among
populations of Py. perforata, and between Py. perforata and other species of Pyropia. No alignment
differences were observed within populations of Py. perforata, but significant polymorphisms were evident
among populations of this species. Barcode findings were similar to those of the LCB analysis (Fig. 5).
Most notably, the intraspecific mitochondrial genome content differences for Py. perforata were: 1) the
lectotype and two other collections of Py. perforata (San Francisco and Baja California Sur) lack the
entire 2,326 bp large subunit ribosomal intron present in other species of Pyropia, whereas some Py.
perforata and Py. kanakaensis both lack part (1,274 bp) of the same intron; 2) type material of Py.
perforata contains a single orf546 gene, whereas the other specimens either have an additional
non-identical orf546 repeat totaling 2,478 bp in size, or totally lack orf546 (Carmel, California); 3) Py.
perforata from Santa Barbara and La Jolla lack a 2,075 bp open reading frame (orf693) that is present in
the other Py. perforata specimens and in other species of Pyropia; 4) Py. perforata from La Jolla,
California codes for an additional tRNA (histidine); and 5) the holotype material of Po. sanjuanensis
contains a 2,590 bp insertion that codes for a group II intronic open reading frame (orf813) not present in
the other Py. perforata, but present in Py. haitanensis and Py. tenera.

Figure 1 | Images of six type specimens analyzed in this study and their high sensitivity quantitations.
(a), (d), Lectotype (LD-Ag 13037) of Porphyra perforata J. Agardh with 860.75 pg/µl at 98 bp. (b), (e),
Holotype (VK-11-00061) of Po. sanjuanensis V. Krishnamurthy with 218.65 pg/µl at 74 bp. (c), (f), Syntype
(UC 807662) of Po. perforata f. segregata Setchell & Hus with 3496.84 pg/µl at 100 bp. (g), (j), Lower
specimen on the lectotype sheet of Po. perforata (LD-Ag 13038), identified as Po. kanakaensis by Conway 29 ,
with 5.21 pg/µl at 50 bp. (h), (k), Holotype (VK-11-00121) of Po. fucicola V. Krishnamurthy with
50.00 pg/µl at 150 bp. (i), (l), Holotype (Mumford #161) of Po. kanakaensis Mumford with 24.10 pg/µl at
142 bp. FU = fluorescence units, bp = base pairs.

Table 1 Species, voucher, collection, and GenBank information for Pyropia analyzed in this study

Pyropia perforata | 1 IP 901 0009
Pyropia perforata | VK-11-00061, Po. sanjuanensis holotype | V. Krishnamurthy/19-Feb-1968/Minn. Reef, San Juan Is., Wash.
Pyropia kanakaensis | WTU 255136, Py. kanakaensis holotype | Bay, San Juan Is., Wash. | KJ708763 | KJ776836
Pyropia kanakaensis | UC 1863890 | R. Moe/12-Aug-1999/Land's End, San Francisco, Calif. | KJ708765 | Not determined | SAMN02743494
Pyropia fucicola | VK-11-00121, Py. fucicola holotype | V. Krishnamurthy/13-May-1968/Makah Bay, Wash. | KJ708762 | KJ776837 | SAMN02743495

Phylogenetic Markers. Analysis of the standard chloroplast markers ribulose-1,5-bisphosphate
carboxylase/oxygenase (rbcL) and the universal plastid amplicon (UPA), plus the universal mitochondrial
barcode marker cytochrome oxidase 1 (CO1), found few polymorphisms (Supplementary Table 5) among
populations of Py. perforata from Alaska, USA to Baja California Sur, Mexico. The rbcL gene for Py.
perforata showed 0-2 (6) bp variation (the 6 bp variation was exhibited solely in the specimen from Baja
California Sur), and the lectotype sequence of Py. perforata was identical to three sequences deposited in
GenBank from Alaska, USA and British Columbia, Canada. No differences for the UPA gene were observed among
Py. perforata populations, and all 13 genome sequences matched the two 371 bp sequences deposited in
GenBank from British Columbia specimens. No polymorphisms were identified for CO1 between the lectotype and
other Py. perforata, with the exception of the Py. perforata from Baja California Sur (which differed by
3 bp) and the holotype specimen of Po. sanjuanensis. The latter was found to contain orf813 inserted in the
CO1 gene (Fig. 5). As noted above, this specific orf813 organization is also found in Py. haitanensis and
Py. tenera. Comparison of CO1 sequences from the Py. perforata genomes to those in GenBank found 12 exact
matches from specimens from British Columbia. Analysis of the holotype of Py. kanakaensis found an exact
match in GenBank to the rbcL sequence generated from a specimen from British Columbia, and two exact
matches for CO1 from specimens of Py. kanakaensis from the same province. The holotype of Py. fucicola
failed to exactly match any sequences in GenBank for rbcL and UPA, but its CO1 barcode was identical to
seven sequences deposited under the name Py. fucicola from British Columbia.

Table 2 Comparison of assembly and genomic data for the specimens of Pyropia analyzed in this study

Species/Voucher/Year Collected
Po. perforata f. segregata/UC 807662/1895†

† denotes assemblies performed in Velvet using kmer=31
* denotes assemblies performed in Velvet using kmer=25
~ estimate based on 97.5% of genome that was obtained
Py. denotes Pyropia
Po. denotes Porphyra

Discussion

The first plastid and mitochondrial genomes from red algae were determined for Porphyra purpurea 20,24 .
The organellar genomes of other Bangiaceae soon followed 19,21,23,25,26 . Excluding six red algal
florideophyte chloroplast genomes and ten mitochondrial genomes, GenBank contains in total the complete
circular genomes of two species of Porphyra (Po. purpurea and Po. umbilicalis), three Pyropia mitochondrial
genomes (Py. yezoensis, Py. haitanensis, and Py. tenera), and two Pyropia chloroplast genomes (Py.
yezoensis, Py. haitanensis). This study investigated genomic divergence at both the intraspecific and
interspecific levels to test the current taxonomic classification of Py. perforata. We analyzed the type
specimens of Po. perforata f. segregata and Po. sanjuanensis and compared the genetic distances exhibited
by these specimens to those between two closely related species, Py. fucicola and Py. yezoensis 12 . The
distance between the latter was calculated to be 0.0338 for the chloroplast genome. The same comparison
for Po. purpurea and Po. umbilicalis gave 0.0833, well within the range observed for all Pyropia distances
compared in this study (0.0338-0.1455). The divergences between the lectotype of Py. perforata and the
types of Po. perforata f. segregata (0.0009) and Po. sanjuanensis (0.0004) fall well within the range for
all Py. perforata from Washington to Baja California Sur (0.0000-0.0053). It is thus concluded that this
variation represents intraspecific variation. For the mitochondrial genome, distances between Py. fucicola
and Py. yezoensis, and between Py. fucicola and Py. tenera, were 0.1463 and 0.1113, respectively. Pairwise
distances between various Pyropia species were quite high (0.1113-0.3499). For Po. purpurea and Po.
umbilicalis the distance was 0.1567. The level of variation observed among populations of Py. perforata was
0.0000-0.0641. Compared to the lectotype of Py. perforata, the types of Po. perforata f. segregata (0.0258)
and Po. sanjuanensis (0.0224) fall within the observed intraspecific range. Based on these well-defined
pairwise distances, the interspecific delineation using complete plastid evidence is likely around 0.025
and higher, and for the mitochondrial genome 0.10 and higher.
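
The approximate delineation values suggested above can be expressed as a small Python sketch. It simply compares a pairwise distance against the proposed cutoffs (about 0.025 for the plastid genome and 0.10 for the mitochondrial genome); the function and dictionary names are hypothetical, and the cutoffs are rough guides derived from the specimens in this study rather than fixed rules.

```python
# Approximate cutoffs proposed in the text; treated here as rough guides only.
INTERSPECIFIC_CUTOFF = {
    "chloroplast": 0.025,
    "mitochondrial": 0.10,
}


def classify_distance(distance, genome):
    """Label a pairwise genome distance as intra- or interspecific.

    `genome` must be "chloroplast" or "mitochondrial". Distances at or above
    the cutoff for that genome are labelled "interspecific".
    """
    cutoff = INTERSPECIFIC_CUTOFF[genome]
    return "interspecific" if distance >= cutoff else "intraspecific"


# Distances reported in the text.
print(classify_distance(0.0009, "chloroplast"))    # lectotype vs. f. segregata type
print(classify_distance(0.0338, "chloroplast"))    # Py. fucicola vs. Py. yezoensis
print(classify_distance(0.0258, "mitochondrial"))  # lectotype vs. f. segregata type
print(classify_distance(0.1113, "mitochondrial"))  # Py. fucicola vs. Py. tenera
```

Distances falling close to a cutoff would call for additional evidence before any taxonomic decision is drawn from them.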

Analysis of standard markers 27 indicates that only scant variation can be detected through the marker
approach compared to the genomic method of analysis. In comparing the chloroplast variation exhibited by
the rbcL gene among populations of Py. perforata, we found a mere 0-2 bp divergence, whereas the genome
data for these same specimens displayed divergence ranging from 1 SNP to 1,072 SNPs and 75 gaps.
Interestingly, the maximum likelihood analysis of the rbcL data generated an evolutionary hypothesis
congruent with the genome data phylogeny. The other chloroplast marker, UPA, failed to exhibit any
polymorphisms in this species. The CO1 barcode showed 0-3 bp variation, whereas the genome data for these
specimens found content, length (32,491 to 40,042 bp), and SNP variation (3 SNPs to 934 SNPs, 127
single/multiple length gaps, and 1 large gap). These results suggest that the marker based approach to
phylogenetics is failing to identify a large amount of cryptic molecular diversity in these algae.
Comparison of the CO1 phylogeny to the genome derived tree found incongruence. The CO1 data alone were
unable to resolve populations of Py. perforata, and supported different relationships compared to the
genome-based hypothesis.

Figure 2 | Maximum likelihood analysis of chloroplast genomes (a), rbcL sequences (b), mitochondrial
genomes (c), and CO1 sequences (d) of Pyropia and Porphyra. Numbers above branches are maximum likelihood
bootstrap values based on 1,000 replicates. The legend below represents the scale for nucleotide
substitutions. The analysis was performed using RAxML and the default parameters in Galaxy 43-45 . The tree
was constructed with TreeDyn 198.3 at Phylogeny.fr 46 .

Figure 3 | Locally collinear blocks (LCBs) analysis for 17 complete chloroplast genomes. The figure depicts
linearized alignments identifying 33 conserved gene regions for 14 specimens of Pyropia (a-n), two of
Porphyra (o-p), and outgrouped with Cyanidium caldarium (q). Each chromosome is oriented horizontally, and
homologous blocks are shown as identically colored regions linked across genomes. Regions that are inverted
relative to Py. perforata are shifted below the genome's center axis. Sequence similarities within an LCB
are proportional to the heights of interior colored bars. Large sections of white within blocks, and gaps
between blocks, indicate lineage specific sequences. The analysis shows that chloroplast genomes of Pyropia
and Porphyra are similar in content and gene synteny. The single observable difference is the presence of
ribosomal 16S, 23S and 5S repeats found in other Pyropia and the two species of Porphyra, but absent in Py.
perforata. (a-l) Py. perforata: (a) LD-Ag 13037, (b) VK-11-00061, (c) UC 807662, (d) LD-Ag 13038,
(e) UC 95739, (f) LD-Ag 13031, (g) UC 95735, (h) LD-Ag 13032, (i) UC 1450590, (j) UC 2019900,
(k) UC 2019901, (l) UC 2019902; (m) Py. haitanensis, (n) Py. yezoensis, (o) Po. purpurea, (p) Po.
umbilicalis, (q) Cyanidium caldarium. The figure was drawn using Mauve 2.3.1 39 .

Figure 4 | Locally collinear blocks (LCBs) analysis for 20 complete mitochondrial genomes. The figure
depicts linearized alignments identifying 18 conserved gene regions for six species of Pyropia (a-r) and
two of Porphyra (s-t). Each chromosome is oriented horizontally and homologous blocks are shown as
identically colored regions linked across genomes. Regions that are inverted relative to Py. perforata are
shifted below the genome's center axis. Sequence similarities within an LCB are proportional to the heights
of interior colored bars. Large sections of white within blocks, and gaps between blocks, indicate lineage
specific sequences. The analysis shows that mitochondrial genomes within populations of Py. perforata are
similar in content and length, but highly variable between populations and other Pyropia. (a-l) Py.
perforata: (a) LD-Ag 13037, (b) LD-Ag 13038, (c) UC 95735, (d) LD-Ag 13031, (e) LD-Ag 13032,
(f) UC 2019900, (g) UC 2019901, (h) UC 2019902, (i) UC 1450590, (j) UC 807662, (k) UC 95739,
(l) VK-11-00061; (m) Mumford #161, (n) UC 1863890, (o) VK-11-00121, (p) Py. haitanensis, (q) Py.
yezoensis, (r) Py. tenera, (s) Po. purpurea, (t) Po. umbilicalis. The figure was drawn using
Mauve 2.3.1 39 .

Figure 5 | Linearized barcode representation of 20 aligned complete mitochondrial genomes for six species
of Pyropia (a-r) and two of Porphyra (s-t). Matching colors between rows represent similar DNA sequences,
and blanks (white blocks) represent deletion events. The analysis shows that mitochondrial genomes within
populations of Py. perforata are similar in content, but highly variable between populations and other
Pyropia. Deletions of the two large ribosomal subunit introns (rnl intron and orf546), a large 2,075 bp ORF
(orf693), and the insertion of a 2,590 bp ORF (orf813), as well as the insertion of orf729 distinguish
populations and different species of Pyropia. (a-l) Py. perforata: (a) LD-Ag 13038, (b) LD-Ag 13037,
(c) UC 1450590, (d) VK-11-00061, (e) LD-Ag 13031, (f) LD-Ag 13032, (g) UC 2019900, (h) UC 2019902,
(i) UC 2019901, (j) UC 95739, (k) UC 807662, (l) UC 95735; (m) VK-11-00121, (n) Py. yezoensis, (o) Py.
tenera, (p) UC 1863890, (q) Mumford #161, (r) Py. haitanensis, (s) Po. purpurea, (t) Po. umbilicalis. The
figure was illustrated using Jalview 41 .

All of these results, taken together, support previous taxonomic and phylogenetic conclusions regarding
the synonymy of the names Po. perforata f. segregata and Po. sanjuanensis under Py. perforata 15,16 . This
species, although quite variable in its mitochondrial sequence between populations, is circumscribed to
accommodate monostromatic thalli that inhabit the uppermost intertidal to the lower intertidal, that are
variable in color with ruffled margins, vary in thickness from 40-60 µm, are monoecious and reproduce
sexually with tiers of zygotosporangia in 2 or 4 (mixed and not mixed with vegetative cells) and
spermatangia in tiers of 8, but that also reproduce asexually via aplanospores, and show a karyotype of 2
or 3 13,28-32 . One of the specimens that was analyzed, LD-Ag 13038 (Fig. 1g), mounted on the lectotype
sheet of Py. perforata, was previously attributed incorrectly to Py. kanakaensis based on anatomical
examination 29 . This specimen should be designated syntype material, especially in light of the fact that
it is excessively perforate, and the sheet carries the inscription Porphyra perforata in the author's
(J. Agardh's) handwriting. The other specimen, which was misidentified as Py. perforata (UC 1863890 from
Land's End, San Francisco, California), was determined by mitochondrial genome and partial plastid analysis
to be assignable to Py. kanakaensis.

Worldwide, herbaria are estimated to contain 300 million specimens, and nearly none of them are being used
for molecular phylogenetic studies. Of the estimated 70,000 plant species still to be described, more than
half have already been collected and are stored in herbaria 33 . In an age when university administrators
are cutting funds or considering closure of herbaria on the grounds of obsolescence, there is a need for a
method that will allow type and non-type specimens to be compared against existing older names, as well as
future names. Our data show that this need can be satisfied using very small amounts of archival herbarium
tissue. The methodologies used here are optimized for library construction from DNA of low quality and
concentration (several of the samples contained less than 0.5 ng of total DNA). The amount of material
required for this type of analysis is similar to that traditionally used for microscopic examination. In
addition, our results show that large amounts of single read sequence data are not required to decipher the
chloroplast and mitochondrial genomes. In this case we assembled the two circular genomes of the specimen
of Py. perforata from Baja California Sur with only 4,716,038 filtered reads. Once deciphered, the large
amount of information housed in the chloroplast and mitochondrial genomes likely eliminates the need for
future sampling of the type material for organellar purposes. The complete circular genomes of type
specimens can be used in part (i.e. markers) or in total to address barcode, phylogenetic, conservation,
taxonomic, historical, evolutionary, and population studies. These data show that 19th and early 20th
century herbarium specimens have great value for current and future systematic and genomic studies and,
with respect to type specimens, are essential for the accurate application of species names for all plants,
algae and fungi where ample material was archived.
Methods

DNA was isolated following the protocol of Lindstrom et al. 9 , with the following exception: nucleic
acids were resuspended with 60 µl of elution buffer. The extractions were performed using 4 × 4 mm² of
material following the precautionary contamination guidelines outlined by Hughey and Gabrielson 11 . DNA
quality and quantity were analyzed by the High-Throughput Genomics Center (HTGC) on an Agilent 2100
Bioanalyzer following the manufacturer's instructions. The genome library was constructed based on a
modified TruSeq protocol developed by HTGC (Supplementary Methods). The 36 bp single end sequencing was
performed using the manufacturer's protocol via the cBot and HiSeq 2000 by HTGC. Filtered reads were base
called using Illumina's standard pipeline, then assembled on the Bio-Linux 7 34 platform with Velvet 35
running on auto settings. After the first run, the data were rerun, optimizing for the expected cutoff and
coverage cutoff based on the coverage data from the first assembly. Specimens with more than 15 million
reads were assembled using kmer = 31, while those with less than 8 million were assembled with kmer = 25.
The resulting contigs were searched at NCBI using Megablast, then aligned contigs were ordered according
to reference sequences (Py. yezoensis, Py. haitanensis, and Po. purpurea). To validate the joined contigs,
targeted PCR and sequencing, and assembly comparisons to Metavelvet 36 contig results, were performed on
the first three genomes assembled (LD-Ag 13037, UC 807662, VK-11-00061). Genomes processed later were
confirmed by aligning sequence reads against a draft assembly in NextGENe® (SoftGenetics LLC). The ORFs
were annotated using NCBI ORF-finder and alignments obtained via BLASTX and BLASTN searches at NCBI. The
tRNAs were identified using the tRNAscan-SE 1.21 web server 37 and the rRNAs using the RNAmmer 1.2
server 38 . LCB alignments were generated using ProgressiveMauve 39 with a seed of 21 for the chloroplast
and mitochondrial alignments, with the 'Use seed families' option selected. The barcode alignment of the
mitochondrial data was performed with MAFFT 7.0581 40 using default settings, and the results were
presented with Jalview 41 . Alignment results from MAFFT were analyzed with RAxML 42 using the default
parameters in Galaxy 43-45 , and the phylogenetic tree was visualized with TreeDyn 198.3 at
Phylogeny.fr 46 . Pairwise distances were calculated using the default settings (GTR substitution model) by
DIVEIN 47 . Deconseq analysis to determine human and bacterial contaminant percentages was run against the
following: Human-Reference GRCh37, 57,317 unique 18S sequences, and 2,206 unique bacterial genomes at the
90-94% default settings.
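
The k-mer selection rule described above can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, and because the text does not state which k-mer was used for specimens with between 8 and 15 million reads, the sketch returns None for that range rather than guessing.

```python
def choose_kmer(filtered_reads):
    """Return the Velvet k-mer length implied by the rule in the Methods.

    More than 15 million filtered reads -> 31; fewer than 8 million -> 25.
    Read counts between these bounds are not covered by the stated rule,
    so None is returned for them.
    """
    if filtered_reads > 15_000_000:
        return 31
    if filtered_reads < 8_000_000:
        return 25
    return None


# The smallest and largest read counts reported in the Results.
print(choose_kmer(4_716_038))
print(choose_kmer(68_784_178))
```

A shorter k-mer tolerates lower coverage at the cost of more ambiguous overlaps, which is consistent with using the smaller value for the specimens with the fewest reads.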